-
Notifications
You must be signed in to change notification settings - Fork 920
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix synchronization issue when writing string columns with dictionary to ORC #14595
Fix synchronization issue when writing string columns with dictionary to ORC #14595
Conversation
@abellina Would you please confirm that disabling the ORC dict sorting addresses the query error? I believe this confirmation should be prerequisite to merging. |
@abellina and I verified that disabling the dictionary sorting did not resolve the issue. We're currently digging into other parts of the associated PR that were not specific to the sorted dictionary path. |
This reverts commit 2f24822.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me.
Co-authored-by: Bradley Dice <[email protected]>
… to ORC (rapidsai#14595) Changes in rapidsai#14295 introduced a synchronization issue in `build_dictionaries`. After stripe_dicts are initialized on the host, we copy them to the device and then launch kernels that read the dicts (device copy). However, after these kernels we deallocate buffers that are not longer needed and clear the dicts' views to these buffers on the host. The problem is that, without synchronization after the H2D copy, the host modification can be done before the H2D copy is performed, and we run the kernels with the altered state. This PR adds a sync point to make sure the copy is done before host-side modification. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Nghia Truong (https://github.com/ttnghia) - Alessandro Bellina (https://github.com/abellina) - Bradley Dice (https://github.com/bdice)
Description
Changes in #14295 introduced a synchronization issue in
build_dictionaries
. After stripe_dicts are initialized on the host, we copy them to the device and then launch kernels that read the dicts (device copy). However, after these kernels we deallocate buffers that are not longer needed and clear the dicts' views to these buffers on the host. The problem is that, without synchronization after the H2D copy, the host modification can be done before the H2D copy is performed, and we run the kernels with the altered state.This PR adds a sync point to make sure the copy is done before host-side modification.
Checklist